High Throughput Screening of Co-Expressed Gene Pairs with Controlled False Discovery Rate (FDR) and Minimum Acceptable Strength (MAS)

Authors

  • Dongxiao Zhu
  • Alfred O. Hero
  • Zhaohui S. Qin
  • Anand Swaroop
Abstract

Many exploratory microarray data analysis tools, such as gene clustering and relevance networks, rely on detecting pairwise gene co-expression. Traditional screening of pairwise co-expression controls either biological significance or statistical significance, but not both. The former approach does not provide stochastic error control, and the latter approach screens many co-expressions with excessively low correlation. We have designed and implemented a statistically sound two-stage co-expression detection algorithm that controls both the statistical significance (false discovery rate, FDR) and the biological significance (minimum acceptable strength, MAS) of the discovered co-expressions. Based on estimation of pairwise gene correlation, the algorithm provides an initial co-expression discovery that controls only FDR, followed by a second-stage co-expression discovery that controls both FDR and MAS. It also computes and thresholds the set of FDR p-values for each correlation that satisfies the MAS criterion. Using simulated data, we validated the asymptotic null distributions of the Pearson and Kendall correlation coefficients and the two-stage error-control procedure; we also compared our two-stage test procedure with another two-stage test procedure using the receiver operating characteristic (ROC) curve. We then used yeast galactose metabolism data to illustrate the advantage of our method for clustering genes and constructing a relevance network. The method has been implemented in an R package, "GeneNT", that is freely available from the Comprehensive R Archive Network (CRAN): www.cran.r-project.org/.
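
The two-stage idea in the abstract can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the GeneNT implementation: the function names `pearson` and `two_stage_screen` and their parameters are hypothetical. It assumes Pearson correlation with the Fisher z approximation, uses the Benjamini-Hochberg step-up rule for the FDR-only first stage, and in the second stage builds a confidence interval for each selected pair at the adjusted level 1 - kq/M (k pairs selected out of M), keeping only pairs whose interval lies entirely outside [-MAS, MAS]. The authors' actual procedure may differ in its details.

```python
import math
from itertools import combinations
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for p-values and quantiles


def pearson(x, y):
    """Sample Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def two_stage_screen(profiles, q=0.05, mas=0.5):
    """Hypothetical sketch of an FDR-then-MAS screen.

    profiles: dict mapping gene name -> list of expression values (length n > 3).
    q: target false discovery rate; mas: minimum acceptable |correlation|.
    Returns a list of (gene1, gene2, r) for pairs passing both stages.
    """
    n = len(next(iter(profiles.values())))
    se = 1.0 / math.sqrt(n - 3)  # asymptotic standard error of Fisher's z

    pairs = []
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
        z = math.atanh(r)  # Fisher z-transform
        p = 2.0 * (1.0 - _STD_NORMAL.cdf(abs(z) / se))  # two-sided, H0: rho = 0
        pairs.append((p, g1, g2, r, z))

    # Stage I: Benjamini-Hochberg step-up rule, controlling FDR only.
    m = len(pairs)
    pairs.sort()
    k = 0
    for i, item in enumerate(pairs, start=1):
        if item[0] <= q * i / m:
            k = i
    selected = pairs[:k]
    if not selected:
        return []

    # Stage II: interval for each selected pair at the adjusted level 1 - k*q/m;
    # keep a pair only if its interval excludes the whole band [-mas, mas].
    crit = _STD_NORMAL.inv_cdf(1.0 - (k * q / m) / 2.0)
    kept = []
    for p, g1, g2, r, z in selected:
        lo = math.tanh(z - crit * se)
        hi = math.tanh(z + crit * se)
        if lo > mas or hi < -mas:
            kept.append((g1, g2, r))
    return kept
```

Calling `two_stage_screen(profiles, q=0.05, mas=0.5)` on a dictionary of expression profiles returns only those pairs that are both statistically significant and strong enough. Lowering `mas` toward 0 makes the second stage progressively less restrictive than it is at `mas=0.5`.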

Similar resources

A novel medium-throughput biological assay system for HTLV-1 infectivity and drug discovery

Objective(s): Here, a reporter cell line containing two reporter vectors was developed in order to monitor Human T-Lymphotropic Virus type 1 (HTLV-1) infectivity and cell viability simultaneously. Materials and Methods: The reporter cell line was constructed by stably transfecting the baby hamster kidney cell line (BHK-21) with the genomes expressing two different reporters in separate pl...

High Dimensional Variable Selection with Error Control

Background. The iterative sure independence screening (ISIS) is a popular method in selecting important variables while maintaining most of the informative variables relevant to the outcome in high throughput data. However, it not only is computationally intensive but also may cause high false discovery rate (FDR). We propose to use the FDR as a screening method to reduce the high dimension to ...

Resting-state Functional Connectivity During Controlled Respiratory Cycles Using Functional Magnetic Resonance Imaging

Introduction: This study aimed to assess the effect of controlled mouth breathing during the resting state using functional magnetic resonance imaging (fMRI). Methods: Eleven subjects participated in this experiment in which the controlled “Nose” and “Mouth” breathings of 6 s respiratory cycle were performed with a visual cue at 3T MRI. Voxel-wise seed-to-voxel maps and whole-brain region of i...

Symmetric Directional False Discovery Rate Control.

This research is motivated by the analysis of real gene expression data that aims to identify a subset of "interesting" or "significant" genes for further studies. When we blindly applied the standard false discovery rate (FDR) methods, our biology collaborators were suspicious or confused, as the selected list of significant genes was highly unbalanced: there were ten times more under-expr...

A note on the false discovery rate and inconsistent comparisons between experiments

Motivation: The false discovery rate (FDR) has been widely adopted to address the multiple comparisons issue in high-throughput experiments such as microarray gene-expression studies. However, while the FDR is quite useful as an approach to limit false discoveries within a single experiment, like other multiple comparison corrections it may be an inappropriate way to compare results across exper...

Journal:
  • Journal of computational biology : a journal of computational molecular cell biology

Volume 12, Issue 7

Pages: -

Publication date: 2005